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Much of the trading activity in Equity markets is directed to 
brokerage houses. In exchange they provide so-called "soft dollars," 
which basically are amounts spent in "research" for identifying prof- 
itable trading opportunities. Soft dollars represent about USD 1 out 
of every USD 10 paid in commissions. Obviously they are costly, and 
it is interesting for an institutional investor to determine whether 
soft dollar inputs are worth being used (and indirectly paid for) or 
not, from a statistical point of view. To address this question, we de- 
velop association measures between what broker-dealers predict and 
what markets realize. Our data are ordinal predictions by two broker- 
dealers and realized values on several markets, on the same ordinal 
scale. We develop a structural equation model with latent variables 
in an ordinal setting which allows us to test broker-dealer predic- 
tive ability of financial market movements. We use a multivariate 
logit model in a latent factor framework, develop a tractable estima- 
tor based on a Laplace approximation, and show its consistency and 
asymptotic normality. Monte Carlo experiments reveal that both the 
estimation method and the testing procedure perform well in small 
samples. The method is then used to analyze our dataset. 

1. Introduction. The point of departure of the present paper is the anal- 
ysis of a dataset provided by the Geneva University pension fund, consisting 
of historical data of financial forecasts from 2 broker-dealers about the mid- 
term evolution of the stock markets in 5 countries and the bond markets in 
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4 zones, respectively. These broker-dealers were asked each quarter during 
6 years to provide their forecasts for each country in terms of market trends 
(stock and bond indices) for the next 6 months. For our purpose they have 
been recorded on an ordinal scale from 1 to 5. In order to decide whether 
the forecasts are valid, they should be compared with the actual evolutions 
of the corresponding markets which were also recorded on the same ordinal 
scale. The issue is to determine whether the forecasts made by the broker- 
dealers are in some sense "near" the realized market evolutions six months 
later. Implicitly we assume that the broker-dealers are small enough so that 
they cannot influence the market. Therefore, there is no causal relationship 
between forecasts and future realizations. We are in a multivariate context 
since the forecasts concern different countries at the same time. Formally, 
the aim is to measure (and test for) the association between two random 
vectors, say, X (the forecasts) and Y (the market realizations), whose size 
p > 2 is the same (4 zones or 5 countries), and whose entries consist of ordinal 
variables corresponding to the forecast and realized market states (values in 
{1, . . . , 5}) for each country, respectively. 

The study of the association between two random vectors is often of 
interest in applied statistics and econometrics. If the multivariate data 
are normal, the canonical correlation coefficient can be used [see, e.g., 
Mardia, Kent and Bibby (1979)]. It is defined as the maximal correlation 
coefficient between any linear combinations of elements of X and any lin- 
ear combinations of elements of Y. When not, for example, when the data 
are collected via questionnaires on scales with a limited number of points, 
the canonical correlation is no longer appropriate. To our knowledge, no 
association measure has been proposed so far to compare multivariate ordi- 
nal random variables. Association measures between univariate categorical 
or ordinal variables have been proposed for a long time now; see, for ex- 
ample, Goodman and Kruskal (1979) and Agresti (1990). For example, the 
Pearson tetrachoric correlation is based on the idea that there exist con- 
tinuous bivariate normal distributions underlying cross-classification tables. 
The tetrachoric correlation is the correlation of the bivariate normal dis- 
tribution having produced the cell probabilities of the table. This idea has 
been extended to association measures between two ordinal variables with 
the polychoric correlation [Olsson (1979)] and between a normal and a bi- 
nary, polytomous or ordinal variable with the polyserial correlation [see Tate 
(1955a, 1955b), Cox (1974) and Lee and Poon (1986)]. 

In this paper we attempt to combine both ideas by constructing hidden or 
latent bivariate normal variables, one for each vector of ordinal variables and 
by defining an association measure which is the correlation between the la- 
tent normal variables. More precisely, we achieve that through the specifica- 
tion of a multivariate multinomial logit (MNL) with latent factors [see, e.g., 
McFadden (1984) for an introduction]. This is done in the spirit of structural 
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equation models (SEM) with latent variables [see, e.g., Aigner et al. (1984) 
for an introduction] and generalized linear latent variable models [see, e.g., 
Bartholomew and Knott (1999) and Skrondal and Rabe-Hesketh (2004) for 
an introduction]. The resulting association measure is similar to the canon- 
ical correlation coefficient in the normal case and similar to the polychoric 
correlation in the univariate case. Our association measure corresponds to a 
model parameter which is easily estimated (together with other parameters) 
using a Laplace approximated maximum likelihood estimator (LAMLE) pro- 
posed by Huber, Ronchetti and Victoria-Feser (2004) for which we develop 
asymptotic properties. Consequently, statistical inference for the association 
measure can be performed using the properties of the estimator. 

In a broader sense, latent variable models encompass a large number of 
models that are frequently used in recent applications. Examples include 
multilevel models (or hierarchical models), generalized linear mixed models 
or Bayesian hierarchical models. For example, Zaslavsky (2007) uses a hier- 
archical model for the analysis of consumer assessments of health care data, 
while Fielding and Yang (2005) use a generalized linear mixed model for the 
analysis of educational achievement data. In these models, latent variables 
are used to model the variability imputed to the observations that lie at 
different levels of clustering (i.e., the random effects) and the emphasis is 
put on the estimation of the variance of the random effects. Latent variables 
are also used to model (time) sequences by means of hidden Markov model 
(HMM), as is done, for example, in Zhou and Wong (2007) for the analysis 
of genomic sequences for short sequence elements. Wu et al. (2007) use a hi- 
erarchical state space model coupled with an HMM to analyze a short time 
course microarray experiment. In these models, the latent variable is the hid- 
den time sequence modeled using a (hidden) Markov chain. Mixture models 
use categorical latent variables (or latent classes) mainly for classification 
purposes. For example, Erosheva, Fienberg and Joutard (2007) propose a 
latent class model to classify elderly Americans into functional disability 
classes according to their scores (able or not able) on different daily activi- 
ties over different periods. Hence, an estimated latent score (here a class) is 
attributed to each observation according to the value of the response vector 
that permits the classification into the different latent classes. In the model 
we propose, the aim is also to attribute a latent score but on a continu- 
ous scale to each ordinal vector of forecasts and actual market realizations 
simultaneously and quantify their correlation. 

Estimation of latent variable models has also seen a substantial activity 
in recent research. Latent variables in these models need to be integrated 
out from the likelihood function which then implies the computation of 
complicated integrals. Bayesian methods like Markov Chain Monte Carlo 
(MCMC) possibly coupled with the EM algorithm, are often used in practice. 
Alternatively, the integrals can be approximated using (adaptive) Gauss 
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quadratures [as implemented, e.g., in GLLAMM in the STATA package; see 
Rabe-Hesketh, Skrondal and Pickles (2002)] or Laplace approximation. For 
the type of model we consider here, Huber, Ronchetti and Victoria-Feser 
(2004) argue that a Laplace approximation of the likelihood function is a 
better approach: the resulting estimator is asymptotically unbiased and fast 
to compute (a key advantage when using resampling methods, for example). 
Finally, it should be noted that standard SEM packages rely on two-stage 
procedures that basically reduce the information given in the sample to an 
estimate of the (multivariate) mean and covariance (using, e.g., polychoric 
correlations). These two-stage procedures are slower and cannot guarantee 
consistency of parameter estimators. 

The paper is organized as follows. In Section 2 we present the datasets 
and the problem at hand, and motivate the construction of an association 
measure between the ordinal random vectors. In Section 3 we develop a 
multivariate multinomial logit (MNL) with latent factors in the framework 
of the generalized linear latent variable model (GLLVM) and propose the 
correlation between the latent variables as the measure of association. We 
then compare in Section 4 this measure to the polychoric correlation and the 
canonical correlation. Estimation and asymptotic properties are investigated 
in Section 5. We rely on the so-called Laplace approximation [De Bruijn 
(1981)] to get a tractable and fast estimation procedure of the latent variable 
model, and show consistency and asymptotic normality of the resulting esti- 
mators. Section 6 is devoted to Monte Carlo experiments aimed at gauging 
the performance in small samples of the estimation method and the testing 
procedure of the measure of association. We gather the empirical results in 
Section 7, while technical details and proofs are relegated to supplemental 
material [see Huber, Scaillet and Victoria-Feser (2009c)]. 

2. The data. The database^ contains the forecasts (in terms of trends) of 
two broker-dealers A and B about the mid-term (6 months) evolution of the 
stock market in five different countries (Switzerland, Germany, France, Great 
Britain and USA) for A and the bond market in four zones (Switzerland, 
Euro Zone, Great Britain and USA) for B. The trends have been clearly 
and precisely defined as corresponding to a given future variation x with: 
X < -10% (strong bear), -10% <x < -5% (bear), -5% < x < 5% (neutral), 
5% < X < 10% (bull), 10% < X (strong bull), for the stock market and x < 
-0.25%, -0.25% <x< -0.10%, -0.10% <x< 0.10%, 0.10% < x < 0.25%, 
0.25% < X, for the bond market. They have been recorded on an ordinal scale 
from 1 to 5. In both cases, we compare the forecasts to the actual returns 



*The datasets are provided as part of supplemental material; see 
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of the corresponding markets six months later. For the stock market, the 
actual trends are measured on the stock indices: S&P500 (US), FTSEIOO 
(UK), CAC40 (FR), DAX (D) and SMI (CH). The sample starts in July 1997 
and finishes in April 2003 with one forecast every quarter (22 observations). 
The observations are sequential in time but since the time gaps are of 3 
months (a quarter), we can assume that there is no serial correlation. 

The data corresponding to broker-dealer A (stock markets) are repre- 
sented in Figure 1 in the form of graphs of the observed versus predicted 
values for each stock market separately. The data corresponding to broker- 
dealer B (bond markets) are represented in the same way in Figure 2. It 
is not clear at all if the predictions are "in general" in accordance with 
the corresponding actual market values. It seems, however, that roughly the 
broker-dealers are in general able to follow the trends, except that they tend 
to underestimate their magnitude. 

This type of representation gives a first idea on the performance of the 
broker-dealers, but a more formal approach, in the form of an indicator 
(and its variability), is more appropriate. We need to compare the predictive 
performance of several broker-dealers for these markets from an institutional 
point of view. If the data were recorded on a normal scale, a canonical 
correlation could be used. For an ordinal scale, an appropriate measure of 
correlation is the polychoric correlation, but it is only defined between pairs 
of (ordinal) variables. Hence, at least for the problem at hand, we need an 
association measure between two vectors of ordinal variables. 

Measuring the predictive ability of broker-dealers is an important issue 
because institutional investors make a large portion of overall trading volume 
in Equity markets, and much of this trading activity is directed to brokerage 
houses who execute trades. In exchange for directed trades, most of the bro- 
kerage houses provide so-called "soft dollars." Soft dollar arrangements are 
arrangements under which products or services other than execution of secu- 
rities transactions are obtained by an institutional investor from or through a 
broker-dealer in exchange for the placement of his orders [see Blume (1993), 
Johnsen (1994) and the Securities and Exchange Commission (1998) for the 
detailed definition, history and law related to soft dollars]. These arrange- 
ments are best thought of as ways of subsidizing the research inputs that in- 
vestors use to identify profitable trading opportunities. In contrast to "hard 
dollars" (actual cash), which have to be reported on investor books, soft 
dollars are incorporated into brokerage fees and the expenses investors pay 
for needs not be reported directly. Soft dollars arrangements were first devel- 
oped as a means of competition among brokers. With broker-dealers being 
unable to compete based on commission rates fixed by regulation, complex 
arrangements to provide equity research and services became a primary tool 
of differentiation among brokerage houses. They also provided a way to in- 
vestors to recapture a portion of the high commissions they were required 
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Fig. 1. Observed (dashed line) and predicted (solid line) market values by broker-dealer 
A, for five different stock markets. 



to pay. The US Security Exchange Commission (SEC) aboHshed fixed rate 
commissions on May 1, 1975. Shortly after industry participants expressed 
concern that soft dollar practices would be viewed as a violation of a manager 
fiduciary obligation to place the client interest above his own. In response, 
US Congress passed Section 28(e) of the Securities and Exchange Act to 
provide a "safe harbor" and protect managers of being accused of breaching 
their duty. This legal acceptance explains why the industry still offers such 
arrangements nowadays. US regular surveys about the size of the soft dollar 
industry are conducted by Greenwich Associates. Their 2007 survey of 229 
financial institutions indicates that soft dollar commissions totaled almost 
USD 723 million in 2007 down from USD 970 million in 2006. This repre- 
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Fig. 2. Observed (dashed line) and predicted (solid line) market values by broker-dealer 
B, for four different zones. 

sents about USD 1 out of every USD 10 (10%!) paid in commissions by those 
firms involved. As recently as 2004, more than 80% of institutions used soft 
dollars; by 2007 that proportion is still around 60%. On the contrary, less 
than a third are currently buying equity research and services with hard 
dollars. Obviously soft dollar practices are costly and widespread, and it is 
therefore important for an institutional investor to determine whether these 
soft dollar inputs are worth being used (and indirectly paid for) or not, from 
a statistical point of view. Indeed, even if soft dollars vary by account size, 
annual turnover and asset class, they still represent a meaningful portion of 
the overall annual cost of actively managed equity portfolios. They directly 
impact the overall performance of funds, and thus are to be monitored. 

After presenting the model in Section 3, we will assess in Section 7 the 
predictive ability of the broker-dealers by means of our association measure. 

3. A SEM for multivariate measure of association. Recall that the aim is 
here to measure an association between a set X = {X^^\ . . . , X^^^)' of mani- 
fest variables and another set Y = {Y^^\ . . . , y(p))' of manifest variables. Let 
U = (X', Y')' and let Fx and Fy be latent variable vectors of dimension 
mx X 1 and my x 1 with my^mx <p. Let also F = (l,F'Y,Fy)' = (1,F'^2))') 
such that F(2) ~ A^(0,R), R being a correlation matrix. If the manifest 
variables are normal, we could use a SEM for normal vectors given by 

(1) C/«|F^;iJJ-Ar(A;F,</>2), 1 = 1,. ..,2p 
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Consequently, R : 



has a single unknown parameter p which can be 



with, for I = 1, . . . ,p, A/ = (al^\ O')' the intercept and factor loadings 
for the manifest variables X and (()f their residual variance, and, for / = 

p+ l,...,2p. A; = {oy ^\o',(3y ^^'y the intercept and factor loadings for 
the manifest variables Y and (pf their residual variance. In principle, the 
dimension of the latent spaces mY,mx could be greater than 1, but for the 
purpose of building a single association measure, the choice is my = mx = 1- 

interpreted as an association measure between X and Y (see also Section 
4). 

The normalization (unit variance) of the latent variable F(2) is necessary 
for identification purposes. Indeed, since the latent variables in model (1) 
are multiplied by the factor loadings, an increase (decrease) of the variance 
of F(2) cannot be distinguished from a simultaneous decrease (increase) of 
the factor loadings. Moreover, since we are interested in computing the cor- 
relation between the latent factors Fx and Fy, this normalization has no 
influence on the association measure. 

Incorporating additional factors (i.e., increasing my and mx) does not 
raise any difficulties; however, the interpretation in terms of measure of 
association becomes more difficult. It is therefore important to check the 
model fit when assuming mx = my = 1; this will be done in our empirical 
illustration (see Section 7). Finally, we also note that the dimensions of X 
and Y need not to be equal. 

Model (1) provides measures of association between vectors of normal 
variables. It can be generalized to the family of exponential distributions 
for vectors Z = (X', Y')' of (nonnecessarily normal) manifest variables with 
probability distribution function: 

,2) „(Z")|r)=exp{ "'^i^'"';;^'"'^i^» +c(Z»-',^,)}, 

1 = 1,. ..,2p, 

where «(A;F) is the so-called canonical parameter, 6(ti(A;F)) and c{z'^^\(j)i) 
are specific functions whose form depends on the particular exponential 
distribution, and (f)i is a scale parameter [see McCullagh and Nelder (1989)]. 
Except for the normal case, the expectation i?[Z(')|F] is not a linear function 
of F, but is linked to the linear predictor through a link function f as 

v{E\Z^^\'F]) = X'{F. 

We further have that n(AjF) = A;F when we choose the so-called canonical 
link function for v. This model belongs to the class of Generalized Linear 
Latent Variables Model (GLLVM) which has been proposed by Moustaki 
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(3) f{z) 



(1996) and Moustaki and Knott (2000) under an assumption of indepen- 
dence between the Gaussian latent variables (diagonal R). This type of 
modeling can be viewed as an extension of the usual Generalized Linear 
Models approach [McCullagh and Nelder (1989)] to the latent factor frame- 
work. 

The conditional independence of the manifest variables Z^^^ given the 
latent ones is assumed, so that the conditional joint density of the manifest 
variables is n?£ifl'/(-^^'^|F) and their marginal joint distribution is 

-2p 

.1=1 

with ip being the A^(0,R) density function. 

Because of the nature of the data at hand, we need to develop here- 
after the case of ordinal variables, that is, ordered categorical variables. 
Let Z^'^lF follow a multinomial distribution with possible values (or cate- 
gories) going from 1 to qi. In the following we opt for a cumulative logit 
formulation [see Agresti (1990) for the advantages of this formulation over 
other ones, and Joreskog and Moustaki (2001) for a comparison of different 
approaches in the framework of factor analysis with ordinal data] to ac- 
count for the ordered nature of the categorical data. Let Pis = P[Z^^^ < s\F], 
s = 1, . . . ,qi,he the conditional cumulative distribution functions. The quan- 
tity log{Pis/{l — Pis)) is the log-odds of falling into or below a category s 
versus falling above it for the manifest variable /. It is used in the logit 
link between the linear predictor and the conditional cumulative probability 
distribution: 

(4) i.{Pis)=log(-^)=X'isF, 



I- Pis, 

where A/, = (aj''\/3j^', 0')' or A;, = (af"^''\ 0', /3?"*'^')' depending on the 
value of I. The index s in the a's indicates that each intercept depends not 
only on the manifest variable / but also on the category s. The constraint 
for each manifest variable that the p slope coefficients (the beta's) in A/^ 
does not depend on the category s is known as the proportional-odds as- 
sumption, and essentially allows us to reduce the number of parameters to 
be estimated. The intercepts take the interpretation of thresholds and are 
monotonic in the sense that the lowest category receives the lowest thresh- 
old, and so on. They represent the log-odds of falling into or below category 
s when all latent variables are nil, while a given positive slope leads to an 
increase on the log-odds of falling into or below any category associated 
with a one unit increase in the corresponding latent variable. A positive 
slope indicates thus an increase in the odds themselves, and higher proba- 
bilities for the manifest variable to take low values. For identification pur- 
poses, the highest threshold is set equal to infinity by convention, which 



10 



p. HUBER, O. SCAILLET AND M.-P. VICTORIA-FESER 



means that we only need to estimate the qi — 1 threshold for the mani- 
fest variable /. With all these restrictions, the model is fully identifiable. 
In general, the thresholds can be assumed to be different for each manifest 
(ordinal) variable. However, when the ordinal variables take values under 
the same measurement unit (e.g., percentages), we can constrain the thresh- 
olds to be equal for all manifest variables, that is, a^x^^ = Oy = a^*^ 
for all /. This is a suitable constraint for the analysis of our data (see Sec- 
tion 7). 

The scale parameter is here equal to 1, while the canonical parameter 
is not linear in the latent factors (since we do not use the canonical link 
function), but equal to 

n(ALF)=log^ ^' 



Pi,s+i - Pis 



while 



b{u{\lF))=log{l + eMu{X'ism 

Pl,s+l 



log 



Pl,s+1 - Pis 



and c{Z^^\(j)i) = 0. The conditional distribution of the manifest variable is 
given by 



p \ '-(^'''<s) / p _p \ t(ZW<s+l)-t(Z(0<s) 
/s \ / J~^l,s-\-l ^Is \ 



=1 
-1 



(5) =n 



■.Pl,s+lJ \ Pl,s+l 



: exp M^^'^ < ^XKsP) - ^{Z^'^ < s + 1)6(^(ALF))] 



=1 



where l{Z^^^ = s) = 1 if Z(') = s and otherwise, and i{Z^^^ < s) = 1 if Z^^^ < 
s, and t(Z(') < s) = otherwise. 

Note that we could use a probit link instead of the logit function. In prac- 
tice, however, the difference is very small since these two link functions are 
very close (|'I'(2;) — ^'(1.7x)| < 0.01, Vx, where ^ is the logistic distribution 
function and <I> the normal cumulative distribution function); see, for exam- 
ple, Lord and Novick (1968). In the regression model (i.e., F is observed and 
Z is the univariate ordinal response variable), McCullagh and Nelder (1989) 
use the same approach to link the explanatory variable to the first moments 
of the response variable. Finally, let us remark that a specification in terms 
of latent variables is a usual way to reduce the complexity of multinomial 
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model calculation [see McFadden (1984), page 1419], and to achieve a rela- 
tive parsimony in the modeling. This is even more relevant, if not inevitable, 
in a multivariate framework. 

4. Relationship with other measures of association. In this section, we 
compare the measure of association p with two other measures, namely, the 
polychoric correlation which associates two ordinal random vectors and the 
canonical correlation which associates two vectors of normal variables. 

There are many other association measures between two ordinal variables 
such as Goodman and Kruskal (1954) and Gamma and Kendall (1945) tau-6 
[see also Agresti (1984)], but the polychoric correlation is the most similar 
to our measure. 

The polychoric correlation is based on the assumption of the existence of 
a vector [X* ,Y*) of bivariate normal variables with zero mean, unit vari- 
ance and correlation p. Instead of observing directly [X*,Y*), we observe 
the vector {X,Y) of ordered multinomial variables, taking ordered values 
in, say, {l,...,gx} and {l,...,gy}, respectively. The observed variables 
are linked to {X* ,Y*) by means of a set of thresholds a^^^Oy^^s^ = 
1, . . . sy = 1, . . . ,gY through P{X <sx,Y< sy) = P{X* < qJ^^F* < 
ay^^). Using the bivariate normal as the (indirect) model, we can estimate 
the so-called polychoric correlation p by using the likelihood function given 
in Huber, Scaillet and Victoria- Feser (2009a) [see also Olsson (1979)]. With 
the GLLVM with conditional density (5) when p = 1 and taking the probit 
link instead of the logit link, we get a likelihood function that shows that 
although the estimator of the correlation p is different from the polychoric 
correlation [see Huber, Scaillet and Victoria- Feser (2009a)], they are directly 
comparable. Indeed, for both estimators an assumption is made about the 
distribution of the underlying (or factor) variables, namely, normality, and 
the thresholds need to be estimated together with the correlation coeffi- 
cient. With the GLLVM, an additional centering (3xFx (PyFy, resp.) is 
also needed. However, the advantage of the GLLVM approach is that it can 
be easily extended to multivariate X and Y, which is not the case with the 
polychoric correlation. 

On the other hand, the canonical correlation can be used to assess 
the association between two multivariate normal variables [see, e.g., 
Mardia, Kent and Bibby (1979)]. It is defined as the maximal correlation co- 
efficient between any linear combinations of X and any linear combinations 
of Y. For a moment suppose that (X', Y')' is distributed as a multivariate 
normal random variable with mean (/Xjf,^y)' and covariance matrix 

VI _ f'^XX ^XY \ 

^ — V v' V )■ 
^XY ^YY 
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The canonical correlation coefficient is then defined by 



(6) 

b^^Sxxb^fbyXlyyby 
where b^^^ and by are the solutions to the maximization problem 

/„N b^Exyby 
(7) max 

bx,by y^b^SxxbxbV5]yyby 

As noticed in Section 3, in the GLLVM with normal manifest variables and 
with mx = mriY = 1, the correlation p between the latent variables can be 
interpreted as an association measure between X and Y. We may therefore 
ask the next question: what is the link between the latter and the standard 
canonical correlation coefficient pc given in (6)? The following proposition 
shows that p can be rewritten in an analogous form to (6), namely, a corre- 
lation between linear combinations of X and linear combinations of Y but 
with a modified covariance matrix = S + -0 with rj) = diag((/)^). 



Proposition 1. For the SEM (1), the correlation coefficient p between 
the latent variables is given 

(8) p- Px^XYf^Y 



' P'x'^*xx^x(^y'^yyPy 
with (3x = . . /3y = . . and S* = S + ^. 



Equation (8) can be interpreted as the correlation between any linear 
combinations of elements of X and any linear combinations of elements of 
Y when the covariance structure is accounted for measurement error in the 
manifest variables via tp. The correlation coefficient p is thus different from 
the canonical correlation coefficient pc by construction. The advantage of 
defining p instead of pc as an association measure between two vectors of 
random variables is that p can be easily generalized to the case of nonnormal 
variables, like, for example, ordinal variables in our case. 

Keller and Wansbeek (1983) mention that the canonical coefficients can 
be obtained when tp has a particular form. They also use the SEM in (1) 
with categorical variables, and show the relationships between the resulting 
models and Correspondence Analysis. Our approach here is different since 
in the nonnormal case (e.g., ordinal) we change the SEM model to take into 
account the specificity of the data. 
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5. Estimation and asymptotic properties. In this section we propose an 
estimator for the association measure p induced by the GLLVM and de- 
rive its asymptotic properties. Traditionally, the parameters of a GLLVM, 
when the manifest variables are ordered ordinal, are estimated by means 
of a two-stage approach. Polychoric correlations are estimated between all 
pairs of ordinal variables and used in the construction of the correlation ma- 
trix between the manifest variables [Muthen (1984), Poon and Lee (1987)]. 
Then a traditional factor analysis (with or without correlated latent vari- 
ables) is performed. Inference on the GLLVM parameters is performed using 
the asymptotic properties of the polychoric correlation [see, e.g., Joreskog 
(1994)]. This two-stage approach is based on the assumption that the under- 
lying distribution of the manifest variables is normal. When this is not the 
case, the resulting estimators can be biased, essentially because the informa- 
tion in the sample is reduced to estimates of the first two moments of the mul- 
tivariate distribution of the ordinal variables (i.e., their mean and correla- 
tion matrix). By means of simulations, Huber, Ronchetti and Victoria- Feser 
(2004) show the bias effect of this two-stage estimation procedure on esti- 
mates of GLLVM with mixtures between binary and normal variables, while 
Elefant-Yanni, Huber and Victoria-Feser (2004) examine this effect on esti- 
mates of GLLVM with ordinal variables. 

To describe the estimation procedure, let z = [zi , . . . , z„]', with Zj = [zf'^ , . . . , 
zf'^'^W n the sample size, and 2p the number of manifest variables. As the 
marginal distribution of the observed variable must be integrated out from 
the conditional distributions g{Z^^^\F) given by (3), we use a Laplace ap- 
proximation [see De Bruijn (1981)] to approximate the likelihood function 
of the sample as it has been done in Huber, Ronchetti and Victoria-Feser 
(2004) for other types of variables. 

The Laplace approximation to integrals goes back to the original work 
of Laplace. This technique is widely used in mathematics; see, for example, 
De Bruijn (1981). In statistics, it has been used successfully to approximate 
posterior distributions in Bayesian statistics [see, e.g., Tierney and Kadane 
(1986)] and in relation to saddlepoint approximations [Field and Ronchetti 
(1990)]. 

Let h : M"^ — > M be a function which satisfies the following conditions: it is 
continuous and has a global maximum in x, its first and second derivatives 
exist in a neighborhood of x and dh(x.)/dx = and H(x) = d'^h{x)/dxdx' , 
the Hessian matrix, is such that — H(x) is positive definite. Moreover, /i(x) 
is sharply peaked in the neighborhood of x, that is, two positive scalars b 
and c exist such that 

/i(x) < —b if |x — x| > c. 
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Then, 

(9) J e*^W dx = (27r)'"/2 det(-H(x))"i/2^-'^/2 exp(t/i(i))(l + 0{t~^)), 

t — > oo. 

Equation (9) is obtained by an expansion of h{x) about its maximum x: 
J e*''^ dx«y" exp(^t/i(i)+i^/i(x)(x-x) + ^t(x-i)H(i)(x-x)'^ dx 

= exp(i^(x)) J exp^^i(x — x)H(x)(x — x)'^ dx 

= (27r)'"/2 det(-H(x))~i/2^-"/2 exp(i/i(x)). 

Letting A denote the vector of all loadings and thresholds, and R the 
correlation matrix of the latent factors, the approximated log-likelihood / 
for a model with ordered multinomial distributed manifest variables is [see 
Huber, Scaillet and Victoria-Feser (2009c)] 

^ / 1 ^1 
/(A,R|z) =^ --logdet(r(A,R,F,)) - -logdet(R) 

i=l \ 

2p qi-1 

+ E E [^(^'^ < ^H^'ls^^) 
1=1 s=l 

(10) 

- Lizf'> <s + l) log(l + expu{\lFi))] 



where r(A,R, Fj) is a correction matrix that comes from the Laplace ap- 
proximation, Fj(2) = [^'iX'-^iy]' ™d Fi = [1,F^^2)]' is the estimator of the 
latent score for the ith observation which is given by the implicit equation 

Fi(2) :=F,(2)(A,R,Zi) = E E(''(4'^ ^^)^M+l(ALF^) 

(11) 

-i(zf <s + l)Pi,(ALF,))RAi(2), 

where A;(2) is Xis without its first element. 

Huber, Ronchetti and Victoria-Feser (2004) point out that Fj(2) can be 
seen as the MLE of Fj(2) • The Laplace Approximated MLE (LAMLE) of the 
models parameters are obtained from the optimization of /, whose derivatives 
can be computed explicitly, but are omitted here for sake of space. Hereafter, 
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we establish the consistency and asymptotic normality of the LAMLE 6 of 
= (A', vech(R)')', where vech(A) is the stack of the elements on and below 
the diagonal of A. 

Proposition 2 (Consistency). Let 6£@. If Q is compact, 
e^Oo = aigmax E[l(e)]. 

Note that the empirical approximated likelihood is here too complex to 
be shown to be concave in 6. Under concavity of the objective function, 
compactness can be replaced by the assumption of Oq being an element of 
the interior of a convex set Q [see, e.g.. Theorem 2.7 of Newey and McFadden 
(1994)]. 

Proposition 3 (Asymptotic normality). IfQ is compact, 6q G interior (0), 
and Jo = E\d'^l{6Q) / 89 d6'\ is nonsingular, 

with lo = E[dT{eo)/dedI{eo)/de']. 

Alternative estimators have been proposed in the framework of Gener- 
alized Mixed Linear Models, such as the McGilchrist (1994) best linear 
unbiased prediction (BLUP) based on the h-likelihood of Lee and Nelder 
(1996), or the Green (1987) penalized quasi-likelihood (PQL) [see also Bres- 
low and Clayton (1993)]. Huber, Ronchetti and Victoria- Feser (2004) show 
that these estimators are all equal but differ from the LAMLE. 

The results of Proposition 3 could, in principle, be used for inference when 
the sample sizes are large. When this is not that case, it is more suitable to 
use other techniques. For the correlation estimator p, we propose to use the 
transformation function r] introduced by Fisher (1915) that stabilizes the 
variance of the estimator: 

7?(p) =tanh"i(/)) = 
and r] is approximately normal 

.(p)~aa(.„-1^), 

with Up = tanh~^(/9) + 2(n-i) ■ ^ discussion about the Fisher transformation 
can be found in Efron (1982). In practice, we compute the variance of fj, 
which is simply (n — 3)~^, calculate its confidence interval, and transform it 
back to a confidence interval for p. 
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For the other parameter estimators, we use a parametric bootstrap: first, 
we calculate the estimators from the observed sample and then, we gener- 
ate 1000 new samples under the estimated distributions to get new estima- 
tors. We find the biases and endpoints of the confidence intervals using a 
bias-corrected acceleration (BCa) technique as described in Efron (1987), 
Efron and Tibshirani (1993) and Shao and Tu (1995). 

6. Monte Carlo experiments. In order to evaluate the performance of 
our model and our estimator in finite samples, we have performed a simu- 
lation study.^ We consider the model (3) with p = 10, equal thresholds and 
parameter values given in Tables 1-3. With these two sets of parameters, we 
also choose different values for the correlation coefficient, namely, p = —0.5,0 
and 0.5. The first set of parameters (SI) was chosen to match one of the 
real examples analyzed in Section 7 and the other (5*2) to reflect what is 
sensible in practice, that is, a conservative attitude implying large proba- 
bilities associated to small or no changes. For each set of parameters, we 
simulated 500 samples of size n = 30, and computed the LAMLE of A and 
the association measure p. The distribution of the sample bias estimates for 
the thresholds and loadings are presented for each estimator in the form of 

Table 1 
Thresholds for simulation SI 



Cumulative 
Thresholds probabilities 

-4.60 0.01 

-2.94 0.05 

0.85 0.70 

4.60 0.99 



Table 2 
Thresholds for simulation S2 



Cumulative 
Thresholds probabilities 

-2.19 0.10 

-1.39 0.20 

1.39 0.80 

2.19 0.90 



^The C code is provided as part of the supplemental material; see 
Huber, Scaillet and Victoria- Feser (2009b). 



MULTIVARIATE PREDICTORS OF FINANCIAL MARKET MOVEMENTS 17 



Table 3 
Loadings for simulation SI and 
S2 



Latent 1 


Latent 2 


1.60 


0.00 


1.75 


0.00 


1.70 


0.00 


1.30 


0.00 


1.50 


0.00 


0.00 


5.00 


0.00 


9.00 


0.00 


9.00 


0.00 


5.00 


0.00 


6.00 



: = 



Fig. 3. Distributions of threshold bias estimates for simulation SI with p — 0.5. 



boxplots in Figures 3 and 4 (second latent variable) for a correlation p = 0.5 
(for the other values of p and the other loadings we find similar results). Fig- 
ure 5 shows the boxplots for the estimated correlation p under the parameter 
set 51 (for the other set, results are similar). We can see that even for a 
relatively small sample size (given the size of the model), the performance 
is very good in that there is no apparent bias for all parameters, including 
P- 

We have also studied the small sample performance of the probability 
coverage of 95% confidence intervals for p computed with the Fisher trans- 
formation, and have found a probability coverage of 84.9%. 
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to - 



c\ - 




Fig. 4. Distributions of loading estimates (second latent variable) for simulation SI with 
p = 0.5. 

7. Data analysis. The estimated loadings for both broker-dealers are 
given in Tables 4 and 5 with biases and 95% confidence intervals, all com- 




Fig. 5. Distributions of correlation estimates for simulation SI with, respectively, 
p = -0.5, p = 0.0 and p = 0.5. 
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puted via a parametric bootstrap. The estimated correlation between both 
latent variables for broker-dealers A and B is given in Table 6. The scores 
of the latent variables for both broker-dealers are displayed in Figures 6 and 
7. 

The study of the correlation estimates (see Table 6) indicates whether the 
broker-dealer forecasts match the actual market evolution. The correlation 
for both broker-dealers are significantly positive. We can therefore conclude 
that the forecasts are relatively accurate. Alternatively, we can look at the 
latent scores Fx^ and Fy. and see graphically how they evolve. For broker- 
dealer A, they are given in Figure 6 and for broker-dealer B in Figure 7. For 
both broker-dealers, the evolution of the two lines (predicted and actual) 
is pretty similar, thus reflecting the fact that the predictions on the five 
stock markets and on the four bond markets are in phase with the actual 
evolutions of the latter. This reflects again the relatively accurate ability of 
the broker-dealers to predict the markets. 

Tables 4 and 5 present the estimated loadings for broker-dealers A and 
B, respectively. They give another type of information about the behavior of 
the broker-dealers. Indeed, the correlation reflects the ability of the broker- 
dealers to predict the changes in markets directions, but not necessarily the 
range of the changes. The latter can be inferred from the loadings because 
they act as a multiplicative factor of the latent variables. In other words, the 
latent variables give the directions of the market moves, while the loadings 
give the (average) ranges of these moves. In Tables 4 and 5 one can see 
that the loadings for the actual markets are systematically higher than the 
corresponding loadings related to the forecasts. This difference is certainly 
due to the forecasts being in general too conservative: although the direction 
of the movements are correctly predicted, their range is underestimated in 
all markets by the broker-dealers. We have already noticed this feature when 
analyzing the raw data in Figures 1 and 2. 

Table 4 

Estimated loadings for broker-dealer A. The biases, lower (I0.95 ) and upper (wo. 95 ) 
confidence bounds are computed with a parametric bootstrap 



Predicted Observed 



Market 


Estimator 


Bias 


^0.95 


Wo. 95 


Estimator 


Bias 


^0.95 


Wo. 95 


CH 


1.596 


-0.078 


0.441 


3.071 


5.557 


-0.280 


3.716 


10.069 


D 


1.762 


-0.063 


0.456 


3.197 


8.925 


-1.091 


5.399 


17.079 


F 


1.725 


-0.066 


0.411 


3.407 


8.982 


-1.102 


5.493 


17.709 


UK 


1.286 


-0.035 


0.207 


2.724 


4.980 


-0.162 


3.165 


8.386 


USA 


1.506 


-0.103 


0.302 


3.023 


6.222 


-0.442 


4.178 


11.628 



20 



P. HUBER, O. SCAILLET AND M.-P. VICTORIA-FESER 



Table 5 

Estimated loadings for broker-dealer B. The biases, lower (I0.95 ) and upper (uo.95 ) 
confidence bounds are computed with a parametric bootstrap 



Predicted Observed 



Market 


Estimator 


Bias 


'o.95 


Wo. 95 


Estimator 


Bias 


^0.95 


Wo. 95 


CH 


0.632 


0.038 


-0.985 


1.824 


2.971 


-0.082 


1.836 


5.608 


EU 


0.886 


-0.029 


-0.912 


2.200 


5.869 


-1.049 


2.554 


10.014 


UK 


0.544 


0.029 


-0.930 


1.732 


5.419 


-0.897 


2.662 


9.234 


USA 


1.219 


-0.105 


-0.921 


2.665 


7.180 


-1.647 


3.327 


14.059 



Table 6 

Estimated correlations between both latent variables 





Broker-dealer A 






Broker-dealer B 




Estimator 


Bias Zo.95 


Wo. 95 


Estimator 


Bias Z0.95 


Wo. 95 


0.398 


-0.034 0.013 


0.722 


0.320 


0.069 0.027 


0.691 





A A "°' / Vi 
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5> 
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T 
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Fig. 6. Estimated scores for broker-dealer A. The plain line is the forecast and the dotted 
line the actual level of the stock market. 
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SUPPLEMENTARY MATERIAL 

Supplement A: Datasets on the predictions by two broker dealers and re- 
alized values on several markets (DOL 10.1214/08-AOAS213SUPPA; .zip). 
In this supplement, we provide a zip file containing two Excel files for the 
predictions and the realized market values of the two broker-dealers ana- 
lyzed in this paper. 

Supplement B: C code for data analysis and simulations 

(DOI: 10.1214/08-AOAS213SUPPB; .zip). In this supplement we provide a 
zip file containing the source code in C for the programs used to analyze the 
datasets and to perform the simulation study in this paper. 

Supplement C: Technical developments and proofs 

(DOI: 10.1214/08-AOAS213SUPPC; .pdf). In this supplement we provide 
the technical developments for the likelihood comparison between the poly- 
choric correlation and the GLLVM of Section 4, the development of the 
LAMLE for ordered multinomial distributed manifest variables as a com- 
plement of Section 5 and the proofs of Propositions 1-3. 
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